Fitting Aggregation Functions to Data: Part II - Idempotization

Authors

  • Maciej Bartoszuk
  • Gleb Beliakov
  • Marek Gagolewski
  • Simon James
Abstract

The use of supervised learning techniques for fitting weights and/or generator functions of weighted quasi-arithmetic means – a special class of idempotent and nondecreasing aggregation functions – to empirical data has already been considered in a number of papers. Nevertheless, there are still some important issues that have not been discussed in the literature yet. In the second part of this two-part contribution we deal with a quite common situation in which we have inputs coming from different sources, describing a similar phenomenon, but which have not been properly normalized. In such a case, idempotent and nondecreasing functions cannot be used to aggregate them unless proper pre-processing is performed. The proposed idempotization method, based on the notion of B-splines, allows for an automatic calibration of independent variables. The introduced technique is applied in an R source code plagiarism detection system.
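
For orientation, a weighted quasi-arithmetic mean with generator φ and nonnegative weights w summing to one is commonly written as M(x) = φ⁻¹(Σ wᵢ φ(xᵢ)). The Python sketch below is not the authors' method and does not reproduce the B-spline-based idempotization. It only illustrates the idempotency property the abstract refers to, M(t, ..., t) = t, and the resulting restriction that the output always lies between the smallest and largest input, which is why inputs on mismatched scales need calibration first. The function name, the choice of generators, and all sample values are illustrative assumptions.

```python
import math


def weighted_quasi_arithmetic_mean(xs, ws, phi, phi_inv):
    """Return phi_inv(sum_i ws[i] * phi(xs[i])).

    Hypothetical helper for illustration: ws must be nonnegative and sum to 1,
    phi must be strictly monotone on the range of xs, phi_inv is its inverse.
    """
    if len(xs) != len(ws):
        raise ValueError("xs and ws must have the same length")
    if any(w < 0 for w in ws) or not math.isclose(sum(ws), 1.0):
        raise ValueError("weights must be nonnegative and sum to 1")
    return phi_inv(sum(w * phi(x) for x, w in zip(xs, ws)))


def identity(t):
    return t


weights = [0.5, 0.3, 0.2]          # illustrative weights
equal_inputs = [0.7, 0.7, 0.7]     # all sources report the same value

# Idempotency: aggregating identical inputs returns that same value,
# whatever the generator (identity -> weighted arithmetic mean,
# log/exp -> weighted geometric mean).
arith = weighted_quasi_arithmetic_mean(equal_inputs, weights, identity, identity)
geom = weighted_quasi_arithmetic_mean(equal_inputs, weights, math.log, math.exp)

# Consequence of idempotency plus monotonicity: the output is bounded by
# the smallest and largest input. If the desired output for these inputs
# were, say, 0.9, no such function could reach it without first
# recalibrating the inputs.
mixed_inputs = [0.20, 0.30, 0.25]  # illustrative, un-normalized scores
mixed = weighted_quasi_arithmetic_mean(mixed_inputs, weights, math.log, math.exp)

print(arith, geom, mixed)
```

The last value stays inside the interval spanned by the inputs, so a target outside that interval is unreachable for any choice of weights or generator; this is the situation the pre-processing step in the abstract is meant to address.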

Similar resources

Fitting Aggregation Functions to Data: Part I - Linearization and Regularization

The use of supervised learning techniques for fitting weights and/or generator functions of weighted quasi-arithmetic means – a special class of idempotent and nondecreasing aggregation functions – to empirical data has already been considered in a number of papers. Nevertheless, there are still some important issues that have not been discussed in the literature yet. In the first part of this ...

An Application of Discounted Residual Income for Capital Assets Pricing by Method Curve Fitting with Sinusoidal Functions

The basic model for valuation of firm is the Dividend Discount Model (DDM). When investors buy stocks, they expect to receive two types of cash flow: dividend in the period during which the stock is owned, and the expected sales price at the end of the period. In the extreme example, the investor keeps the stock until the company is liquidated; in such a case, the liquidating dividend becomes t...

Extensible Grouping and Aggregation for Data Reconciliation

New applications from the areas of analytical data processing and data integration require powerful features to condense and reconcile available data. Object-relational and other data management systems available today provide only limited concepts to deal with these requirements. The general concept of grouping and aggregation appears to be a fitting paradigm for a number of the mentioned issu...

Extensible and Similarity-based Grouping for Data Integration

The general concept of grouping and aggregation appears to be a fitting paradigm for various issues in data integration, but in its common form of equality-based grouping a number of problems remain unsolved. We propose a generic approach to user-defined grouping as part of a SQL extension, allowing for more complex functions, for instance integration of data mining algorithms. Furthermore, we ...

A continuous approximation fitting to the discrete distributions using ODE

Fitting probability density functions to discrete probability functions has long been needed and is very important. This paper fits continuous curves, which are probability density functions, to the binomial, negative binomial, geometric, Poisson, and hypergeometric probability functions. The key to these fits is the use of the derivative concept and common differential ...

Publication date: 2016